4,573 research outputs found

    PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications

    Large protein language models are adept at capturing the evolutionary information underlying primary structures, offering significant practical value for protein engineering. Compared to natural language, protein amino-acid sequences have a smaller data volume and a more limited combinatorial space, so choosing an appropriate vocabulary size for the pre-trained model is a pivotal issue. Moreover, despite the wealth of benchmarks and studies in the natural language community, there is still no comprehensive benchmark for systematically evaluating protein language model quality. Given these challenges, PETA trains language models with 14 different vocabulary sizes under three tokenization methods and conducts thousands of tests on 33 diverse downstream datasets to assess the models' transfer learning capabilities, incorporating two classification heads and three random seeds to mitigate potential biases. Extensive experiments indicate that vocabulary sizes between 50 and 200 optimize the model, whereas sizes exceeding 800 detrimentally affect its representational performance. Code, model weights, and datasets are available at https://github.com/ginnm/ProteinPretraining. Comment: 46 pages, 4 figures, 9 tables.
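    As a rough illustration of the kind of sub-word tokenization the abstract describes, the sketch below trains a byte-pair-encoding (BPE) vocabulary of a chosen size on amino-acid sequences using the Hugging Face `tokenizers` library. The corpus, vocabulary size, and special tokens are placeholders; this is not PETA's actual pipeline, which also covers other tokenization methods.

```python
# Minimal sketch (not PETA's released code): train a BPE sub-word
# tokenizer on amino-acid sequences with a configurable vocabulary size.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Toy sequences standing in for a real protein corpus.
corpus = [
    "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSG",
]

VOCAB_SIZE = 128  # the abstract reports 50-200 as the sweet spot

tokenizer = Tokenizer(models.BPE(unk_token="<unk>"))
# Amino-acid sequences contain no whitespace, so WhitespaceSplit leaves
# each sequence as a single "word"; BPE then learns merges over residues.
tokenizer.pre_tokenizer = pre_tokenizers.WhitespaceSplit()

trainer = trainers.BpeTrainer(
    vocab_size=VOCAB_SIZE,
    special_tokens=["<unk>", "<pad>", "<cls>", "<eos>"],  # assumed set
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

print(tokenizer.encode("MKTAYIAK").tokens)
```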

    Syntheses and luminescent properties of a series of new lanthanide azelates

    A series of new lanthanide azelates [Ln(aze)(Haze)(H2O)]·H2O {Ln = La (1a), Ce (1b), Pr (1c); H2aze = azelaic acid}, [Ln2(aze)3(phen)2]·H2O [Ln = Nd (2a), Er (2b); phen = 1,10-phenanthroline], [Sm(aze)(Haze)(phen)]·2H2O (3), [Gd(aze)(phen)2]·ClO4 (4) and (Hphen)[Tb2(aze)2(phen)4]·3ClO4 (5) were hydrothermally prepared and structurally characterized. Compounds 1a-c are isostructural and show a 3-D framework based on 1-D infinite [Ln-O-Ln]n chains. 2a and 2b exhibit a sql (square-lattice) layer, while 3 displays a 1-D chain with phen ligands located on both sides of the chain. The Ln3+ ions of 4 and 5 are connected by aze2− into two different types of rare cationic 1-D chains. Luminescence investigations show that both 2a and 2b exhibit interesting NIR luminescence, and 5 shows good potential as a luminescent sensor for the Fe3+ ion. Of particular interest, lanthanide azelates have not been documented to date, and this work presents the only examples of lanthanide azelates exhibiting luminescent properties. The magnetic properties of some of the lanthanide azelates were also investigated.

    STEM teaching for the Internet of Things maker course: a teaching model based on the iterative loop.

    As a key technology for future 5G applications, the Internet of Things (IoT) is developing rapidly, and the demand for cultivating IoT engineering talent is expanding accordingly. The rise of maker education has brought new teaching inspiration for cultivating innovative technical talent in the IoT. In IoT maker courses, common teaching problems include the lack of adequate teaching models, an emphasis on products over theory, and practice that amounts to mere imitation. To address these problems, this paper proposes a new Science, Technology, Engineering, and Mathematics (STEM) teaching model for the IoT maker course called Propose, Guide, Design, Comment, Implement, Display and Evaluate (PGDCIDE). The PGDCIDE teaching model is based on STEM teaching and Kolodner's design-based scientific inquiry learning cycle, and it combines theory, practice, and innovation. Finally, this paper designs an IoT maker course to put the PGDCIDE model into practice. The results indicate that students significantly improved at the emotional, knowledge, and innovation levels after studying the course. The PGDCIDE teaching model can therefore improve the effectiveness of IoT maker course teaching, is conducive to cultivating students' sustainable abilities in engineering education, and offers a reference for applying maker courses in engineering education practice.

    Fast restoration for out-of-focus blurred images of QR code with edge prior information via image sensing.

    Out-of-focus blurring of QR codes is very common in mobile Internet systems; it often causes authentication failures through misreading of the encoded information and thus adversely affects system operation. To tackle this difficulty, this work first introduces an edge prior: the average distance between the center point and the edge of the clear QR code images in the same batch. The prior is motivated by theoretical analysis and practical observation drawing on CMOS image sensing, optics, blur invariants, and the invariance of the center of diffuse light spots. With this prior, and by combining the iterated image with the center point of the binarized image, the proposed method accurately estimates the parameter of the out-of-focus blur kernel. The sharp image is then obtained by a Wiener filter, a non-blind image deblurring algorithm, which avoids excessive redundant computation. Experimental results validate that the proposed method offers great practical utility in deblurring quality, robustness, and computational efficiency, making it suitable for barcode application systems, e.g., warehousing, logistics, and automated production.
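    As a rough sketch of the non-blind step only (the Wiener filtering the abstract mentions, not the authors' released code), the snippet below models defocus blur with a uniform disk point-spread function whose radius is assumed to have already been estimated, and deconvolves in the Fourier domain. The radius, regularization constant `k`, and test image are placeholders.

```python
# Minimal sketch: Wiener deconvolution of an out-of-focus image,
# given an already-estimated defocus radius (placeholder values).
import numpy as np

def disk_psf(radius, size):
    """Uniform disk point-spread function modelling defocus blur."""
    y, x = np.mgrid[:size, :size] - size // 2
    psf = (x**2 + y**2 <= radius**2).astype(float)
    return psf / psf.sum()

def wiener_deblur(blurred, psf, k=0.01):
    """Classic Wiener filter, H* / (|H|^2 + k), applied in Fourier space."""
    H = np.fft.fft2(np.fft.ifftshift(psf), s=blurred.shape)
    G = np.fft.fft2(blurred)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F_hat))

# In the paper the radius comes from the edge-prior estimation;
# here it is just an illustrative constant.
blurred = np.random.rand(256, 256)  # stand-in for a blurred QR image
psf = disk_psf(radius=5, size=blurred.shape[0])
restored = wiener_deblur(blurred, psf)
```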

    Learning Navigational Visual Representations with Semantic Map Supervision

    Being able to perceive the semantics and the spatial structure of the environment is essential for the visual navigation of a household robot. However, most existing works employ visual backbones pre-trained either on independent images for classification or with self-supervised learning methods, then adapt them to the indoor navigation domain, neglecting the spatial relationships that are essential to learning navigation. Inspired by the way humans naturally build semantically and spatially meaningful cognitive maps during navigation, this paper proposes a novel navigation-specific visual representation learning method that contrasts the agent's egocentric views with semantic maps (Ego²-Map). We apply a visual transformer as the backbone encoder and train the model with data collected from the large-scale Habitat-Matterport3D environments. Ego²-Map learning transfers the compact and rich information in a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation. Experiments show that agents using our learned representations outperform recent visual pre-training methods on object-goal navigation. Moreover, our representations significantly improve vision-and-language navigation in continuous environments for both high-level and low-level action spaces, achieving new state-of-the-art results of 47% SR and 41% SPL on the test server.
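    The abstract does not spell out the training objective, but contrasting paired embeddings is commonly implemented with a symmetric InfoNCE loss; the PyTorch sketch below is an assumed illustration of such a loss over (egocentric view, semantic map) pairs, not the authors' Ego²-Map implementation, and all names, dimensions, and the temperature are placeholders.

```python
# Minimal sketch (assumed, not the authors' code): symmetric InfoNCE
# pulling each egocentric-view embedding toward its paired semantic-map
# embedding and pushing it away from the other pairs in the batch.
import torch
import torch.nn.functional as F

def ego2map_infonce(ego_emb, map_emb, temperature=0.07):
    ego = F.normalize(ego_emb, dim=-1)
    maps = F.normalize(map_emb, dim=-1)
    logits = ego @ maps.t() / temperature  # cosine-similarity matrix
    targets = torch.arange(ego.size(0), device=ego.device)  # diagonal pairs
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy usage: a batch of 8 paired embeddings of dimension 256.
loss = ego2map_infonce(torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())
```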